Scalable Maintenance of Multiple Interrelated Data Warehousing Systems
Authors
Abstract
The maintenance of data warehouses (DWs) is becoming an increasingly important topic due to the growing use, derivation and integration of digital information. Most previous work has dealt with a single centralized data warehouse. In this paper, we focus on environments with multiple DWs, some of which may be derived from other DWs. In such a large-scale environment, data updates from base sources may arrive at individual data warehouses in different orders, resulting in inconsistent data warehouse extents. We propose to address this problem by employing a registry agent responsible for establishing one unique order for the propagation of updates from the base sources to the DWs. With this solution, individual DW managers can still maintain their respective extents autonomously and independently of each other, which allows them to apply any existing incremental maintenance algorithm from the literature. We demonstrate that this registry-based coordination approach (RyCo) indeed achieves consistency across all DWs.
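The ordering idea in the abstract can be illustrated with a minimal sketch. This is not the RyCo algorithm itself, whose details the abstract does not give. It only assumes that a registry stamps every base-source update with a global sequence number and that each DW buffers updates until their turn comes. The names `Registry`, `Warehouse`, `register` and `receive` are hypothetical and chosen for illustration.

```python
import heapq
import itertools


class Registry:
    """Hypothetical registry agent: assigns one global order to all source updates."""

    def __init__(self):
        self._counter = itertools.count(1)

    def register(self, source, payload):
        # Stamp the update with the next global sequence number.
        return (next(self._counter), source, payload)


class Warehouse:
    """Hypothetical DW manager: applies updates strictly in registry order."""

    def __init__(self, name):
        self.name = name
        self.applied = []    # log of applied updates, standing in for the DW extent
        self._pending = []   # min-heap of buffered updates, keyed by sequence number
        self._next_seq = 1   # sequence number this DW expects next

    def receive(self, update):
        # Buffer the update, then apply every update whose turn has come.
        heapq.heappush(self._pending, update)
        while self._pending and self._pending[0][0] == self._next_seq:
            _, source, payload = heapq.heappop(self._pending)
            self.applied.append((source, payload))
            self._next_seq += 1


registry = Registry()
u1 = registry.register("S1", "insert r1")
u2 = registry.register("S2", "delete r7")
u3 = registry.register("S1", "update r1")

dw_a, dw_b = Warehouse("A"), Warehouse("B")
for u in (u1, u2, u3):   # DW A receives the updates in registry order
    dw_a.receive(u)
for u in (u3, u1, u2):   # DW B receives them in a different order
    dw_b.receive(u)
```

Under these assumptions both warehouses end with the same applied sequence even though the updates reached them in different orders. How each DW applies an individual update is left open, in line with the abstract's point that any incremental maintenance algorithm can be used locally.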
Similar resources
Multiple View Consistency for Data Warehousing
A data warehouse stores integrated information from multiple distributed data sources. In effect, the warehouse stores materialized views over the source data. The problem of ensuring data consistency at the warehouse can be divided into two components: ensuring that each view reflects a consistent state of the base data, and ensuring that multiple views are mutually consistent. In this paper w...
Incremental Maintenance of Object-Oriented Views in a Warehousing Environment
Data warehousing is an approach to data integration in which integrated information is stored in a data warehouse for direct querying and analysis. To provide fast access, a data warehouse stores materialized views defined over data from its data sources. As a result, a data warehouse needs to be maintained to keep its contents consistent with the contents of its data sources. Incremental maint...
Architecture of a Highly Scalable Data Warehouse Appliance Integrated to Mainframe Database Systems
Main memory processing and data compression are valuable techniques to address the new challenges of data warehousing regarding scalability, large data volumes, near realtime response times, and the tight connection to OLTP. The IBM Smart Analytics Optimizer (ISAOPT) is a data warehouse appliance that implements a main memory database system for OLAP workloads using a cluster-based architecture...
Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google’s Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and quer...
Message from the VLDWH Workshop Chair
Over the last few years, data warehousing, data mining and analytical online processing emerged to form a new paradigm for the storage and analysis of data in business environments and increasingly in scientific environments. With the use of these technologies in a wide variety of applications, the necessity for highly scalable and fast data management tools capable of handling multiple terabytes ...